Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sbom): export bom-ref when converting a package to a component #7340

Merged
merged 18 commits into from
Sep 19, 2024

Conversation

afdesk
Copy link
Contributor

@afdesk afdesk commented Aug 14, 2024

Description

Reproduction Steps

$ wget https://pastebin.com/raw/iD0PiatU
$ trivy sbom --format cyclonedx --scanners vuln iD0PiatU

Before:

      "affects": [
        {
          "ref": "ca36a16f-8acd-4d6a-b9d9-6e9e265bc0d8",
          "versions": [
            {
              "version": "227",
              "status": "affected"
            }
          ]

After:

"affects": [
        {
          "ref": "pkg:pypi/pywin32@227",
          "versions": [
            {
              "version": "227",
              "status": "affected"
            }
          ]
        }
      ]

Related issues

Checklist

  • I've read the guidelines for contributing to this repository.
  • I've followed the conventions in the PR title.
  • I've added tests that prove my fix is effective or that my feature works.
  • I've updated the documentation with the relevant information (if needed).
  • I've added usage information (if the PR introduces new options)
  • I've included a "before" and "after" example to the description (if the PR is a user interface change).

@afdesk afdesk changed the title fix(sbom): export bom-ref fix(sbom): export bom-ref when converting a package to a component Aug 14, 2024
@afdesk afdesk marked this pull request as ready for review August 14, 2024 22:05
@afdesk afdesk requested a review from knqyf263 as a code owner August 14, 2024 22:05
@@ -1607,7 +1607,7 @@ func TestMarshaler_MarshalReport(t *testing.T) {
Updated: "2022-12-20T10:15:00+00:00",
Affects: &[]cdx.Affects{
{
Ref: "pkg:maven/com.fasterxml.jackson.core/[email protected]",
Ref: "pkg:maven/com.fasterxml.jackson.core/[email protected]?file_path=jackson-databind-2.13.4.1.jar",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't include file paths in PURL anymore. If it remains somewhere, it should be deleted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'm removing it

@CLAassistant
Copy link

CLAassistant commented Aug 15, 2024

CLA assistant check
All committers have signed the CLA.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ afdesk
❌ amf


amf seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@knqyf263
Copy link
Collaborator

knqyf263 commented Aug 16, 2024

We should add a new test to ensure original BOM-Refs are kept as-is. In the existing test, the input is SBOM, but the output is Trivy JSON.

{
name: "fluentd-multiple-lockfiles cyclonedx",
args: args{
input: "testdata/fixtures/sbom/fluentd-multiple-lockfiles-cyclonedx.json",
format: "json",
artifactType: "cyclonedx",
},
golden: "testdata/fluentd-multiple-lockfiles.json.golden",
},

@afdesk
Copy link
Contributor Author

afdesk commented Aug 18, 2024

We should add a new test to ensure original BOM-Refs are kept as-is.

The testcase is added now.

{
name: "scan SBOM into SBOM",
args: args{
input: "testdata/fixtures/sbom/pywin32-cyclonedx.json",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we re-use the existing json, fluentd-multiple-lockfiles-cyclonedx.json? You can change this file to use UUID in BOM-Ref.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the testcase is updated fluentd-multiple-lockfiles-cyclonedx.json.
I wanted to use fluentd-multiple-lockfiles.cdx.json.golden also, but this cdx result is a large, because it's result for fluentd-multiple-lockfiles.tar.gz. fluentd-multiple-lockfiles-cyclonedx.json is shorter.
so I've created a new golden file.

Copy link
Contributor

@DmitriyLewen DmitriyLewen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@afdesk I may have found another issue (maybe for another PR):
File iD0PiatU contains 2 same packages with different BOMRefs.
Should we show 2 BOMrefs in affects array:
something like that:

"affects": [
        {
          "ref": "pkg:pypi/pywin32@227",
          "versions": [
            {
              "version": "227",
              "status": "affected"
            }
          ]
        }
        {
          "ref": "de3ojve0eoj0j0je",
          "versions": [
            {
              "version": "227",
              "status": "affected"
            }
          ]
        }
      ]

Comment on lines 62 to 65
"bom-ref": "3ff14136-e09f-4df9-80ea-000000000006",
"type": "operating-system",
"name": "debian",
"version": "10.2",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

input and output bomrefs for OS are not equial:

"bom-ref": "353f2470-9c8b-4647-9d0d-96d893838dc8",
"type": "operating-system",
"name": "debian",
"version": "10.2",

@afdesk afdesk marked this pull request as draft August 23, 2024 07:42
@afdesk
Copy link
Contributor Author

afdesk commented Aug 29, 2024

Now Trivy builds new SBOM components from the result:

func (e *Encoder) resultComponent(root *core.Component, r types.Result, osFound *ftypes.OS) *core.Component {

to keep existing BOM-refs we should look for this one (if report.BOM != nil) by name and save it.
it matters for non-root and non-child components.

if we don't want to use this logic I can see 2 options:

  1. we don't keep intermediate BOM-refs. because these ones are generated UUIDs.
    Trivy still updates them for empty BOM-refs.

  2. we don't encode a result into a new BOM report, and keep existing BOM.
    just will add vuln fieldsL:
    SBOM -> Trivy -> the same SBOM+ vulnerability list.

@knqyf263 @DmitriyLewen wdyt? thanks

@DmitriyLewen
Copy link
Contributor

to keep existing BOM-refs we should look for this one (if report.BOM != nil) by name and save it.
it matters for non-root and non-child components.

I think we can try to use this way.
But we need to save operating-system here:

trivy/pkg/sbom/io/decode.go

Lines 120 to 136 in 98e136e

case core.TypeOS:
if m.osID != uuid.Nil {
onceMultiOSWarn()
continue
}
m.osID = id
sbom.Metadata.OS = &ftypes.OS{
Family: ftypes.OSType(c.Name),
Name: c.Version,
}
continue
case core.TypeApplication:
if app := m.decodeApplication(c); app.Type != "" {
m.apps[id] = app
continue
}
}

we don't keep intermediate BOM-refs.

Do you mean BOMrefs for #2?

trivy/pkg/sbom/io/encode.go

Lines 160 to 173 in 98e136e

// Container component (alpine:3.15) --------------------- #1
// -> Operating System Component (Alpine Linux 3.15) --- #2
// -> Library component (bash-4.12) ------------------ #3
// -> Library component (vim-8.2) ------------------ #3
// -> etc.
//
// Else if a package is language-specific package associated with a lock file,
// it will be a dependency of "Application" component.
// e.g.
// Container component (alpine:3.15) ------------------------ #1
// -> Application component (/app/package-lock.json) ------ #2
// -> Library component (npm package, express-4.17.3) --- #3
// -> Library component (npm package, lodash-4.17.21) --- #3
// -> etc.

we don't encode a result into a new BOM report, and keep existing BOM.
just will add vuln fieldsL:
SBOM -> Trivy -> the same SBOM+ vulnerability list.

I am not sure about this.
This will be work with Trivy reports, but we may see problem with SBOM files from other sources.

@afdesk
Copy link
Contributor Author

afdesk commented Aug 29, 2024

Do you mean BOMrefs for #2?

yes, for Application component (/app/package-lock.json)

@afdesk afdesk marked this pull request as ready for review August 30, 2024 08:40
@afdesk
Copy link
Contributor Author

afdesk commented Aug 30, 2024

@DmitriyLewen could you take a look? thanks

if c.PackageURL != "" {
purl, err = packageurl.FromString(c.PackageURL)
purl = &packageurl.PackageURL{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we can remove this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems that no, we have to allocate memory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@@ -283,6 +288,14 @@ func (e *Encoder) resultComponent(root *core.Component, r types.Result, osFound
component.Type = core.TypeApplication
}

// try to look for BOM-ref for this component
for _, c := range e.components {
if c.Name == component.Name && c.Type == component.Type {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we correctly store BomRefs for the Root component and packages.
Do we need this check for all types? maybe we can do this check only for Application and OperatingSystem types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@DmitriyLewen DmitriyLewen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@knqyf263 take a look, when you have time

@knqyf263
Copy link
Collaborator

I have refactored. @DmitriyLewen @afdesk Can you please take a look?

Copy link
Contributor

@DmitriyLewen DmitriyLewen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knqyf263 left comments

Comment on lines 440 to 442
components := lo.MapKeys(report.BOM.Components(), func(value *core.Component, _ uuid.UUID) string {
return value.PkgIdentifier.BOMRef
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't work for SPDX.

I thought about using SPDX-ID as bom-ref (see #6907 (comment)).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also I realized that there might be issues with non-Trivy CycloneDX reports:
Bom-Ref is an optional field - https://cyclonedx.org/docs/1.6/json/#components_items_bom-ref
So there might be a case where bom-ref doesn't exist - so we won't add vulnerabilities for these components.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bom-Ref is an optional field - https://cyclonedx.org/docs/1.6/json/#components_items_bom-ref

It's interesting. I was not aware of that. However, vulnerabilities.affects.ref must be BOM-Ref. Do you think we should generate BOM-Refs in this case?
https://cyclonedx.org/docs/1.6/json/#vulnerabilities_items_affects_items_ref

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't work for SPDX.

Hmm. I didn't think SPDX supports vulnerabilities, but I just remember we added support. I'll think about it.
#7213

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, vulnerabilities.affects.ref must be BOM-Re

This is a weird case. I still don't understand why Bom-ref is not a required field.

Looks like we need to separate the possible cases:

  • sbom result without vulnerabilities
  • vuln scanner is disabled - we can skip adding bom-ref to preserve the scanned sbom file as much as possible.
  • Trivy found no vulnerabilities - I'm not sure what we should do. I'm more inclined to think that we still need to add bom-refs (use the same logic when vuln scanner is enabled)
  • sbom result with vulnerabilities - vulnerabilities.affects.ref is a required field => we need to populate this field => we need to use the same bom-ref for the linked component.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree BOM-Ref should be required.

Looks like we need to separate the possible cases:

To keep things simple, IMO, it is better to generate a BOM-Ref if it's missing in CycloneDX that Trivy is scanning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since I don't have statistics on how often CycloneDX files without bom-refs are used - it's hard for me to make a choice.
I'm not against your decision. Let's do it this way and get feedback from users.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to reuse the SBOM of the input as much as possible, but I found that my implementation did not work well when using --pkg-relationships or --pkg-types with trivy sbom.

@@ -165,6 +176,7 @@ func TestSBOM(t *testing.T) {
// Run "trivy sbom"
runTest(t, osArgs, tt.golden, outputFile, types.Format(tt.args.format), runOptions{
override: overrideFuncs(overrideSBOMReport, overrideUID, tt.override),
fakeUUID: tt.fakeUUID,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we only use fakeUUID for serialNumber:

"serialNumber": "urn:uuid:3ff14136-e09f-4df9-80ea-000000000006",

This might be confusing (that we don't overwrite other UUID's).
Maybe we just don't check serialNumber?
@knqyf263 @afdesk wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The serial number is generated by UUID. I think it's better to confirm it is generated as expected rather than ignoring it.

pkg/sbom/core/bom.go Outdated Show resolved Hide resolved
@knqyf263
Copy link
Collaborator

I reverted my changes. I'll open another PR.

@knqyf263 knqyf263 added this pull request to the merge queue Sep 18, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 18, 2024
@knqyf263 knqyf263 added this pull request to the merge queue Sep 18, 2024
github-merge-queue bot pushed a commit that referenced this pull request Sep 18, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 18, 2024
@knqyf263 knqyf263 added this pull request to the merge queue Sep 18, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 18, 2024
@knqyf263 knqyf263 added this pull request to the merge queue Sep 19, 2024
Merged via the queue into aquasecurity:main with commit 5dd94eb Sep 19, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unmatched Vulnerabilities.affects.ref when scanning CycloneDX sbom with duplicate Purls
4 participants